Speaker Verification Using Short Utterances with DNN-Based Estimation of Subglottal Acoustic Features

Authors

Jinxi Guo

Gary Yeung

Deepak Muralidharan

Harish Arsikere

Amber Afshan

Abeer Alwan

Abstract

Speaker verification in real-world applications must sometimes cope with enrollment and/or test data of limited duration. MFCC-based i-vector systems have defined the state of the art for speaker verification, but it is well known that they are less effective with short utterances. To address this issue, we propose a method to leverage the speaker specificity and stationarity of subglottal acoustics. First, we present a deep neural network (DNN) based approach to estimate subglottal features from speech signals. The approach involves training a DNN regression model that maps the log filter-bank coefficients of a given speech signal to those of its corresponding subglottal signal. Cross-validation experiments on the WashU-UCLA corpus (which contains parallel recordings of speech and subglottal acoustics) show the effectiveness of our DNN-based estimation algorithm. The average correlation coefficient between the actual and estimated subglottal filter-bank coefficients is 0.9. A score-level fusion of MFCC and subglottal-feature systems in the i-vector/PLDA framework yields statistically significant improvements over the MFCC-only baseline. On the NIST SRE 08 truncated 10sec-10sec and 5sec-5sec core evaluation tasks, the relative reduction in equal error rate ranges between 6 and 14% for the conditions tested with both microphone and telephone speech.
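
The score-level fusion and the equal error rate (EER) figures mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the fusion rule or its weights, so the z-normalised weighted sum below, the `weight` parameter, and all function names are assumptions made for illustration only; they are not the authors' implementation.

```python
import numpy as np


def fuse_scores(mfcc_scores, subglottal_scores, weight=0.5):
    """Weighted-sum fusion of two systems' trial scores (assumed rule).

    Each system's scores are z-normalised first so that the two
    score ranges are comparable before they are combined.
    """
    a = np.asarray(mfcc_scores, dtype=float)
    b = np.asarray(subglottal_scores, dtype=float)
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    return weight * a + (1.0 - weight) * b


def equal_error_rate(scores, labels):
    """Approximate EER: the point where false-accept and false-reject rates meet.

    labels are truthy for target (same-speaker) trials.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    thresholds = np.unique(scores)  # sorted candidate thresholds
    far = np.array([(scores[~labels] >= t).mean() for t in thresholds])
    frr = np.array([(scores[labels] < t).mean() for t in thresholds])
    i = np.argmin(np.abs(far - frr))
    return (far[i] + frr[i]) / 2.0


def relative_reduction(baseline_eer, fused_eer):
    """Relative EER reduction in percent with respect to the baseline."""
    return 100.0 * (baseline_eer - fused_eer) / baseline_eer
```

In practice a fusion weight of this kind would typically be tuned on held-out development trials rather than fixed at 0.5.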

Similar resources

DNN i-Vector Speaker Verification with Short, Text-Constrained Test Utterances

We investigate how to improve the performance of DNN i-vector-based speaker verification for short, text-constrained test utterances, e.g. connected digit strings. Text-constrained verification, owing to its smaller, limited vocabulary, can deliver better performance than text-independent verification for a short utterance. We study the problem with “phonetically aware” Deep Neural Net (DNN) in its cap...

DNN based Speaker Recognition on Short Utterances

This paper investigates the effects of limited speech data in the context of speaker verification using a deep neural network (DNN) approach. Being able to reduce the length of required speech data is important to the development of speaker verification systems in real-world applications. The experimental studies have found that DNN-senone-based Gaussian probabilistic linear discriminant analysis ...

Speaker recognition via fusion of subglottal features and MFCCs

Motivated by the speaker-specificity and stationarity of subglottal acoustics, this paper investigates the utility of subglottal cepstral coefficients (SGCCs) for speaker identification (SID) and verification (SV). SGCCs can be computed using accelerometer recordings of subglottal acoustics, but such an approach is infeasible in real-world scenarios. To estimate SGCCs from speech signals, we ad...

Channel robust speaker verification via Bayesian blind stochastic feature transformation

In telephone-based speaker verification, the channel conditions can vary significantly from session to session. Therefore, it is desirable to estimate the channel conditions online and compensate for the acoustic distortion without prior knowledge of the channel characteristics. Because no a priori knowledge is used, the estimation accuracy depends greatly on the length of the verification u...

Deep Speaker: an End-to-End Neural Speaker Embedding System

We present Deep Speaker, a neural speaker embedding system that maps utterances to a hypersphere where speaker similarity is measured by cosine similarity. The embeddings generated by Deep Speaker can be used for many tasks, including speaker identification, verification, and clustering. We experiment with ResCNN and GRU architectures to extract the acoustic features, then mean pool to produce ...

Publication date: 2016
